A Statistical Phrase/Accent Model for Intonation Modeling

Authors

  • Gopala Krishna Anumanchipalli
  • Luís C. Oliveira
  • Alan W. Black
Abstract

This paper proposes a statistical phrase/accent model of voice fundamental frequency (F0) for speech synthesis. It presents an approach for automatically extracting and modeling phrase and accent phenomena from F0 contours by taking into account their overall trends in the training data. An iterative optimization algorithm is described that extracts these components while minimizing the reconstruction error of the F0 contour. Modeling the local and global components of F0 separately in this way is shown to be better than the conventional F0 models used in Statistical Parametric Speech Synthesis (SPSS). Perceptual evaluations confirm that the proposed model is significantly better than baseline SPSS F0 models in three prosodically diverse tasks: read speech, radio broadcast speech, and audio book speech.
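The abstract does not spell out the optimization procedure, so the following is only an illustrative sketch of the general idea of alternately estimating a global and a local F0 component while tracking reconstruction error. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the function names `fit_line` and `decompose_f0`, the choice of a straight line for the phrase component, the choice of one constant offset per syllable for the accent component, and the requirement that the input contain voiced frames only.

```python
def fit_line(t, y):
    """Ordinary least-squares fit of y = intercept + slope * t."""
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    var_t = sum((ti - mean_t) ** 2 for ti in t)
    if var_t == 0:
        return mean_y, 0.0
    slope = sum((ti - mean_t) * (yi - mean_y) for ti, yi in zip(t, y)) / var_t
    return mean_y - slope * mean_t, slope


def decompose_f0(f0, syllable_ids, n_iter=20):
    """Hypothetical additive split: f0 ~ phrase (global) + accent (local).

    f0           -- F0 values in Hz, voiced frames only (assumption)
    syllable_ids -- one syllable label per frame (assumption)
    Returns (phrase, accent, errors), where errors holds the mean squared
    reconstruction error after each iteration.
    """
    n = len(f0)
    t = [i / (n - 1) if n > 1 else 0.0 for i in range(n)]

    # Group frame indices by syllable once, up front.
    groups = {}
    for i, s in enumerate(syllable_ids):
        groups.setdefault(s, []).append(i)

    accent = [0.0] * n
    phrase = [0.0] * n
    errors = []
    for _ in range(n_iter):
        # Global step: fit a linear trend to what the accents do not explain.
        intercept, slope = fit_line(t, [y - a for y, a in zip(f0, accent)])
        phrase = [intercept + slope * ti for ti in t]

        # Local step: each syllable gets the mean of its remaining residual.
        for idx in groups.values():
            mean_res = sum(f0[i] - phrase[i] for i in idx) / len(idx)
            for i in idx:
                accent[i] = mean_res

        errors.append(
            sum((y - p - a) ** 2 for y, p, a in zip(f0, phrase, accent)) / n
        )
    return phrase, accent, errors


if __name__ == "__main__":
    contour = [120, 125, 130, 118, 115, 122, 128, 110, 105, 100]
    syllables = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]
    phrase, accent, errors = decompose_f0(contour, syllables)
    print(errors[0], errors[-1])
```

Because each step is a least-squares minimization over one component with the other held fixed, the recorded error cannot increase from one iteration to the next. A real system would likely use richer component shapes and pool statistics across the training data, as the abstract indicates.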

Similar resources

Implications of Prosody Modeling for Prosody Recognition

This paper introduces Stem-ML, which is a model of the prosody generation process with an associated description language, and suggests how it may help prosody recognition. We applied Stem-ML modeling to three topics: the modeling of prosodic strengths, intonation types, and noun phrase patterns. Stem-ML parameters derived from F0 contours may have a more consistent relationship with prosodic ...

A system of stylized intonation contours in German

Modeling intonation, i.e., specifying adequate fundamental frequency (F0) contours, remains a challenging task for speech synthesis systems. This paper discusses the development of a system for phonetically specifying intonation contours for German. It deals with the problem of translating an abstract phonological representation of intonation, namely the tone-sequence model, into a concrete phone...

Inventory of intonation contours for text-to-speech synthesis

This paper presents an intonation model which determines intonation contours over intonation phrases. The model is described by four elements: the communicative type of an intonation phrase; the number of accent groups in it; the position of the nuclear accent group in it; and a set of target intonation points. Individualization of the model is based on semiautomatic analysis of a speaker database. The model w...

Intonation recognition for Indonesian speech based on Fujisaki model

In this paper, we propose using Fujisaki model parameters to distinguish between declarative and interrogative intonation in Indonesian speech. Four combinations of Fujisaki parameters were selected as the features for this distinction. The first combination is only the amplitude of the last accent command. The second combination consists of the amplitude of...

A quantitative description of German prosody offering symbolic labels as a by-product

The prosodic quality of a text-to-speech system is important for the intelligibility and perceived naturalness of synthetic speech. In earlier works the author developed a linguistically motivated model of German intonation based on the quantitative Fujisaki model of the production process of F0. The current paper compares results yielded by automatic Fujisaki modeling with a GToBI-style annotat...

Publication date: 2011